More extensive research is required if community-based measurements of health state valuations are to be used in summary measures of population health status. Regarding the nature of the valuation function and its characteristics for different health states, test and retest data show that ordinal rankings are not consistent with the conventional notion that the individual has a single valued function. The conventional economic theory of well-ordered preferences assumes that an individual has a clear pairwise ordering of alternatives. Our observations in this study suggest that the valuation in a person's mind for a given health state may be a multivalued function. The range of values over which the valuation function is defined is a function of the health state and the extent of cumulative deliberation by the individual. This hypothesis needs to be tested through more studies. There are two research questions here: whether health state valuation functions are multivalued or single valued and, if the health state valuation function is indeed multivalued, which factors determine the range of values (i.e. the image space) it can assume. The incidence or prevalence of the health state, any associated taboo, the severity level, the health state description system, and deliberation by the individual are some factors worth exploring. The extent to which a person has deliberated about a health state may affect the size of the health state valuation image space. Our conjecture is that, up to some point, increased opportunity for deliberation about the value of a health state would narrow the health state valuation image space. This has important implications for the methodology to be adopted for community measurement of health state valuations.
If increased opportunity for deliberation does indeed lead to a narrowing of the health state valuation image space, then repeated measurements accompanied by an opportunity for deliberation and reflection on the health state concerned would improve the reliability of measurements. If there is no such relationship, then a larger sample size may be the only means to improve the reliability of health state valuation measurements from a community.
So far, health state valuation studies have used a set of indicator conditions to obtain valuations from a community and have used various interpolation strategies to assign disability weights to other health states. Such interpolated values are estimates of the mean health state values that might have been measured from the community. This study has clearly shown that health states vary in the extent to which the community has crystallised valuations for them. The extent to which community valuations for a health state are diffused or crystallised has important implications for their use in summary measures of population health status. The mean disability weight may be adequate for health states with crystallised community valuations. Where the community valuation for a health state is diffused, the mean does not have much significance as an input for summary measures of population health status. For such health states, either an uncertainty analysis by multiple simulations or a sensitivity analysis giving endpoint values from the range of valuations would be desirable. In other words, the distribution of valuations in the community is of as much importance as the estimated mean of health state valuations. It is difficult to predict the distribution of all health states from the distribution of indicator conditions. Hence, future research will have to directly measure valuations by communities for all health states. Operationalising such measurements in single studies may be difficult, but appropriate strategies can be found once the need for direct measurements from communities, for as many health states as possible, is clearly recognised.
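The two options mentioned above, a sensitivity analysis using endpoint values and an uncertainty analysis by multiple simulations, can be illustrated with a minimal sketch. It is not the procedure used in this study. The function names, the person-years figure, the 10th and 90th percentile endpoints, and the two sets of valuations are all hypothetical, and the summary measure is reduced to a simple product of disability weight and person-years lived in the state.

```python
import random

def sensitivity_endpoints(valuations, person_years, lower_q=0.1, upper_q=0.9):
    """Burden at a low and a high quantile of the observed valuations.

    Assumption: burden = disability weight * person-years (a simplification).
    """
    ordered = sorted(valuations)

    def quantile(q):
        # Pick the order statistic nearest to position q * (n - 1).
        return ordered[round(q * (len(ordered) - 1))]

    return quantile(lower_q) * person_years, quantile(upper_q) * person_years

def simulate_burden(valuations, person_years, n_draws=1000, seed=0):
    """Uncertainty analysis: draw the weight from the empirical distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.choice(valuations) * person_years for _ in range(n_draws)]

# Hypothetical disability weights for two health states (illustrative only).
crystallised = [0.78, 0.80, 0.80, 0.82, 0.80]
diffused = [0.05, 0.20, 0.40, 0.60, 0.75]

for name, sample in (("crystallised", crystallised), ("diffused", diffused)):
    low, high = sensitivity_endpoints(sample, person_years=100.0)
    draws = simulate_burden(sample, person_years=100.0)
    print(name, "endpoints:", low, high, "simulated range:", min(draws), max(draws))
```

For the crystallised example the endpoints lie close together, so the mean is a reasonable single input. For the diffused example the interval is wide, which is the situation in which reporting the range or the simulated distribution is more informative than the mean alone.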
Thirdly, more research is required to study the efficacy of health state description systems in reliably communicating the same state to all individuals. This line of research will be more culture-specific. Two areas need attention: the semantic content of the statements used to describe the severity levels along each dimension, and the validity and reliability of the graphical description system. In this study we have developed a graphical description system. The graphics were chosen from about five to six alternatives by showing the pictures to a convenience sample of persons. It will be useful to study more formally the validity and reliability of the graphical description system in communicating a given health state. Such research projects should include plans for further refinement of the graphics. These studies are important for the measurement of health state values in partially literate as well as multicultural communities. Once we know enough about semantically equivalent statements of severity levels and have equivalent graphics, it should be possible to develop multimedia description systems.
Finally, more studies are required on the nature of the relationship between different measurement methods. For example, the VAS and TTO valuations in this study were found to be similar. This differs from earlier findings that TTO generally gives lower disability weights than VAS for milder conditions. Future studies should carefully document details of measurement techniques, measurement context, and interviewer and valuer characteristics, so that factors contributing to differences in valuations between the two methods and between sites can be identified.
An important contribution of this study is the advancement of methodological aspects of health state valuation in developing country communities. A health state description system incorporating a graphical description component was developed to facilitate communication in partially literate communities. Some deliberative tools for the conduct of health state valuation workshops for educated persons were developed. The experience gained in the valuation of health states in developing country settings will, we hope, help future research.
Apropos the substantive aspect of the subject, this study has shed some light on, and raised many questions about, the nature of the health state valuation process in our minds. Analysis of test and retest data on the ordinal ranking of health states, the valuation of one's own health state, and differences in the distribution of valuations at the community level all lead us to hypothesise that the true health state valuation in our minds is a multivalued fuzzy set with different degrees of clarification. The conventional theory that the true valuation is a single valued function is not consistent with our observations, and appears intuitively less appealing.
Health state valuation studies will have to contend with the problem of measurement error, as is the case in most other areas of psychometric measurement. Unfortunately, we do not yet have a fully worked out measurement model for health state valuations. Most studies use reliability measures conceived under classical test theory, which was developed in the context of educational testing and measurement. In the field of educational measurement, it is generally assumed that the object of measurement is distributed normally with some variance. If the variance component attributable to subjects is high, then educational tests are considered reliable. We have seen that community valuations of health states follow different distributions. Health state valuations are not personal endowments that can be assumed to be distributed normally in a fashion similar to, say, intelligence. If the community valuation for a health state is well crystallised, then the true variance of subjective valuations will be smaller than for health states where the valuation is more diffused. Generalisability theory allows for a more realistic modelling of the measurement process. The reliability of the health state valuations in this study can be said to be moderate, on the basis of the generalisability coefficients obtained (0.56 to 0.67) and conventional reliability measures such as the ICC (0.6), within valuer correlation (around 0.6 to 0.8) and within valuer ICC (0.6 to 0.8). However, a more appropriate measurement model of health state valuation will help in the correct estimation of reliability.
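To make the variance-ratio logic behind such reliability measures concrete, the following is a minimal sketch of a one-way random effects intraclass correlation, computed from between-target and within-target mean squares. The choice of the one-way form is an assumption made for illustration; the text does not state which ICC form was used in this study. The function name and the example ratings are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random effects ICC for a table of repeated ratings.

    ratings: list of rows, one row per target (e.g. a health state or a
    valuer), each row holding the same number k of repeated valuations.
    Returns (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-target mean square: spread of row means around the grand mean.
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-target mean square: spread of repeats around their own row mean.
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical test and retest valuations for three targets.
example = [[0.2, 0.4], [0.5, 0.3], [0.8, 0.6]]
print(icc_oneway(example))
```

The coefficient is high only when the between-target variance is large relative to the within-target variance. This is why a well crystallised health state, with little true variance between valuers, can yield a low coefficient even when individual measurements are stable.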
The incidence of measurement error and our present understanding of the nature of the valuation process suggest that community level valuation of health states requires a large sample size as well as repeated measurements. Large sample sizes would help minimise the measurement error for mean values estimated from community surveys. Repeated measures, it is anticipated, will occasion repeated deliberation by the valuers and thereby help clarify their value sets. The tradeoffs between sample size and repeated measurements will have to be studied.
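One simple way to think about this tradeoff is a one-way random effects model in which the variance of the community mean is the between-valuer variance divided by the number of valuers, plus the within-valuer error variance divided by the total number of measurements. The sketch below assumes that model. The function name and variance values are hypothetical, and the model does not capture the conjectured effect of deliberation in narrowing a valuer's own range of values.

```python
import math

def se_of_mean(var_between, var_error, n_valuers, n_repeats):
    """Standard error of the community mean valuation.

    Assumed model: each valuer has a true value (variance var_between) and
    each measurement adds independent error (variance var_error).
    Var(mean) = var_between / n + var_error / (n * k).
    """
    variance = var_between / n_valuers + var_error / (n_valuers * n_repeats)
    return math.sqrt(variance)

# Hypothetical variance components, compared across three study designs.
for n, k in ((100, 1), (100, 4), (400, 1)):
    print(n, "valuers x", k, "repeats:", round(se_of_mean(0.04, 0.04, n, k), 4))
```

Under this model, repeated measurements reduce only the error component, while additional valuers reduce both components. Repeats are therefore most useful when within-valuer error is large relative to between-valuer variance, or when they also bring the benefit of further deliberation.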
The health state valuation instruments used in this study have good content validity, considering that they have been derived by many people working from similar conceptual definitions of health and are based on the empirical listing of health state attributes by some large studies. The criterion validity of health state valuation instruments cannot be tested, since we do not have a good standard for this purpose. The instruments have shown good convergent validity. Measurements from multiple methods, such as the visual analogue scale, time tradeoff, and person tradeoff, agreed quite well with each other. Incorporating an ordinal rank consistency requirement in valuation tasks appeared to facilitate deliberation.
Ordinal rank consistent visual analogue scaling (VAS) turned out to be a fairly valid tool for the measurement of health state values. The VAS valuations agreed quite well with valuations from other methods such as time trade-off and person trade-off. In fact, VAS did better in some cases. For example, the incidence of counterintuitive valuations was lower for VAS. Considering its simplicity and its feasibility for community surveys, VAS appears to be the instrument of choice for measurement of health state valuations.
So far, researchers have focused on mean valuations. This study has demonstrated that community valuations of health states do not all follow the same distribution. The degree of crystallisation of valuations appears to be health state dependent. Valuations for some states, for example infertility, are quite diffused. Valuations for others, for example quadriplegia, are well crystallised. Differences in the distribution of valuations by the community for different health states have policy implications and hence should be the subject of further research. This implies that health state valuation studies using a few indicator conditions will not provide the required inputs for summary measures of population health. Data on indicator conditions allow for statistical decomposition of multiattribute valuations and model based estimation of disability weights for other health states. These statistical models can estimate mean valuations only, and cannot provide any information about their distribution. The only way to understand the distribution of valuations for all health states is to measure valuations for each of them in the community concerned.